Dataset statistics
| Number of variables | 15 |
|---|---|
| Number of observations | 20000 |
| Missing cells | 7344 |
| Missing cells (%) | 2.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 2.3 MiB |
| Average record size in memory | 120.0 B |
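The missing-cell percentage can be recomputed from the counts in this table. The sketch below is only a sanity check with illustrative variable names (they are not part of the report): it divides the missing cells by the total number of cells, observations times variables, and arrives at the reported 2.4%.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 15
n_observations = 20000
missing_cells = 7344

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```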
Variable types
| NUM | 8 |
|---|---|
| CAT | 6 |
| BOOL | 1 |
Warnings
| NOME has a high cardinality: 19867 distinct values | High cardinality |
|---|---|
| NOTA_EM is highly correlated with NOTA_DE | High correlation |
| NOTA_DE is highly correlated with NOTA_EM | High correlation |
| NOTA_GO is highly correlated with NOTA_MF | High correlation |
| NOTA_MF is highly correlated with NOTA_GO | High correlation |
| NOTA_GO has 3716 (18.6%) missing values | Missing |
| INGLES has 3628 (18.1%) missing values | Missing |
| NOME is uniformly distributed | Uniform |
| NOTA_DE has 3575 (17.9%) zeros | Zeros |
| NOTA_EM has 3584 (17.9%) zeros | Zeros |
| NOTA_MF has 4331 (21.7%) zeros | Zeros |
| NOTA_GO has 3537 (17.7%) zeros | Zeros |
| H_AULA_PRES has 657 (3.3%) zeros | Zeros |
| TAREFAS_ONLINE has 2204 (11.0%) zeros | Zeros |
Reproduction
| Analysis started | 2020-10-02 00:34:37.036361 |
|---|---|
| Analysis finished | 2020-10-02 00:34:53.751408 |
| Duration | 16.72 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
MATRICULA
Real number (ℝ≥0)
| Distinct | 19770 |
|---|---|
| Distinct (%) | 98.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 551148.2714 |
|---|---|
| Minimum | 100003 |
| Maximum | 999995 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 156.2 KiB |
Quantile statistics
| Minimum | 100003 |
|---|---|
| 5-th percentile | 145747.35 |
| Q1 | 326554.25 |
| median | 550630 |
| Q3 | 775524.75 |
| 95-th percentile | 956802.5 |
| Maximum | 999995 |
| Range | 899992 |
| Interquartile range (IQR) | 448970.5 |
Descriptive statistics
| Standard deviation | 259488.7666 |
|---|---|
| Coefficient of variation (CV) | 0.4708148062 |
| Kurtosis | -1.192952976 |
| Mean | 551148.2714 |
| Median Absolute Deviation (MAD) | 224464.5 |
| Skewness | 0.007501116468 |
| Sum | 1.102296543e+10 |
| Variance | 6.733441998e+10 |
| Monotonicity | Not monotonic |
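Several of the figures above are derived from others in the same tables. As a rough consistency check (a sketch using the rounded values printed in the report, so comparisons use a tolerance), the range, IQR, coefficient of variation and variance can be recomputed as follows:

```python
import math

# Figures copied from the MATRICULA tables above.
mean = 551148.2714
std = 259488.7666
minimum, maximum = 100003, 999995
q1, q3 = 326554.25, 775524.75

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # CV = standard deviation / mean
variance = std ** 2               # Variance = standard deviation squared

# Compare against the values printed in the report.
assert value_range == 899992
assert iqr == 448970.5
assert math.isclose(cv, 0.4708148062, rel_tol=1e-6)
assert math.isclose(variance, 6.733441998e10, rel_tol=1e-6)
```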
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 751223 | 3 | < 0.1% | |
| 657605 | 3 | < 0.1% | |
| 390727 | 2 | < 0.1% | |
| 249902 | 2 | < 0.1% | |
| 894414 | 2 | < 0.1% | |
| 504812 | 2 | < 0.1% | |
| 761243 | 2 | < 0.1% | |
| 867004 | 2 | < 0.1% | |
| 147715 | 2 | < 0.1% | |
| 668008 | 2 | < 0.1% | |
| Other values (19760) | 19978 | 99.9% |
| Value | Count | Frequency (%) | |
| 100003 | 1 | < 0.1% | |
| 100004 | 1 | < 0.1% | |
| 100022 | 1 | < 0.1% | |
| 100058 | 1 | < 0.1% | |
| 100118 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 999995 | 1 | < 0.1% | |
| 999985 | 1 | < 0.1% | |
| 999921 | 1 | < 0.1% | |
| 999911 | 1 | < 0.1% | |
| 999897 | 1 | < 0.1% |
NOME
Categorical
| Distinct | 19867 |
|---|---|
| Distinct (%) | 99.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 156.2 KiB |
| Value | Count | Frequency (%) | |
| Maria da Silva | 5 | < 0.1% | |
| Melissa de Souza | 4 | < 0.1% | |
| Maria dos Santos | 4 | < 0.1% | |
| Kevin da Silva | 3 | < 0.1% | |
| Kiara de Barbosa | 3 | < 0.1% | |
| Bento da Silva | 3 | < 0.1% | |
| Joyce da Silva | 3 | < 0.1% | |
| Simone da Silva | 3 | < 0.1% | |
| Ana de Andrade | 3 | < 0.1% | |
| Eunice da Silva | 3 | < 0.1% | |
| Other values (19857) | 19966 | 99.8% |
Frequencies of value counts
Unique
| Unique | 19749 |
|---|---|
| Unique (%) | 98.7% |
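The report lists both a "Distinct" and a "Unique" count. Assuming the usual reading of these terms, where "Distinct" is the number of different values and "Unique" is the number of values that occur exactly once, the difference can be sketched with a small hypothetical name list (not data from the report):

```python
from collections import Counter

# Hypothetical miniature of a name column; not data from the report.
names = [
    "Maria da Silva", "Maria da Silva", "Kevin da Silva",
    "Ana de Andrade", "Ana de Andrade", "Bento da Silva",
]

counts = Counter(names)
n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)
```

Under this reading, a column in which every value repeats has a "Unique" count of 0 even though it has several distinct values.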
Histogram of lengths of the category
Length
| Max length | 57 |
|---|---|
| Median length | 24 |
| Mean length | 24.28435 |
| Min length | 8 |
REPROVACOES_DE
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 156.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 16439 | 82.2% | |
| 1 | 2913 | 14.6% | |
| 3 | 648 | 3.2% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
REPROVACOES_EM
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 156.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 16439 | 82.2% | |
| 1 | 2913 | 14.6% | |
| 3 | 648 | 3.2% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
REPROVACOES_MF
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 156.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 15671 | 78.4% | |
| 1 | 3517 | 17.6% | |
| 3 | 812 | 4.1% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
REPROVACOES_GO
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 156.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 15671 | 78.4% | |
| 1 | 3560 | 17.8% | |
| 3 | 769 | 3.8% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
NOTA_DE
Real number (ℝ≥0)
| Distinct | 50 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.19656 |
|---|---|
| Minimum | 0 |
| Maximum | 9 |
| Zeros | 3575 |
| Zeros (%) | 17.9% |
| Memory size | 156.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 5.2 |
| median | 6.2 |
| Q3 | 6.7 |
| 95-th percentile | 7.5 |
| Maximum | 9 |
| Range | 9 |
| Interquartile range (IQR) | 1.5 |
Descriptive statistics
| Standard deviation | 2.522544812 |
|---|---|
| Coefficient of variation (CV) | 0.4854258994 |
| Kurtosis | 0.3543124521 |
| Mean | 5.19656 |
| Median Absolute Deviation (MAD) | 0.7 |
| Skewness | -1.392679723 |
| Sum | 103931.2 |
| Variance | 6.363232328 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 3575 | 17.9% | |
| 6.4 | 932 | 4.7% | |
| 6.5 | 929 | 4.6% | |
| 6.3 | 887 | 4.4% | |
| 6.6 | 846 | 4.2% | |
| 6.2 | 842 | 4.2% | |
| 6.7 | 842 | 4.2% | |
| 6.1 | 802 | 4.0% | |
| 6.8 | 717 | 3.6% | |
| 6.9 | 708 | 3.5% | |
| Other values (40) | 8920 | 44.6% |
| Value | Count | Frequency (%) | |
| 0 | 3575 | 17.9% | |
| 4 | 3 | < 0.1% | |
| 4.1 | 20 | 0.1% | |
| 4.2 | 46 | 0.2% | |
| 4.3 | 47 | 0.2% |
| Value | Count | Frequency (%) | |
| 9 | 1 | < 0.1% | |
| 8.8 | 1 | < 0.1% | |
| 8.6 | 4 | < 0.1% | |
| 8.5 | 5 | < 0.1% | |
| 8.4 | 5 | < 0.1% |
NOTA_EM
Real number (ℝ≥0)
| Distinct | 57 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.080285 |
|---|---|
| Minimum | 0 |
| Maximum | 9.4 |
| Zeros | 3584 |
| Zeros (%) | 17.9% |
| Memory size | 156.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 4.9 |
| median | 5.9 |
| Q3 | 6.7 |
| 95-th percentile | 7.6 |
| Maximum | 9.4 |
| Range | 9.4 |
| Interquartile range (IQR) | 1.8 |
Descriptive statistics
| Standard deviation | 2.523928155 |
|---|---|
| Coefficient of variation (CV) | 0.4968083788 |
| Kurtosis | 0.1526173614 |
| Mean | 5.080285 |
| Median Absolute Deviation (MAD) | 0.9 |
| Skewness | -1.236398587 |
| Sum | 101605.7 |
| Variance | 6.370213329 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 3584 | 17.9% | |
| 6.2 | 627 | 3.1% | |
| 6.7 | 627 | 3.1% | |
| 6.8 | 626 | 3.1% | |
| 5.9 | 616 | 3.1% | |
| 6 | 604 | 3.0% | |
| 6.3 | 592 | 3.0% | |
| 6.5 | 591 | 3.0% | |
| 6.9 | 584 | 2.9% | |
| 6.1 | 577 | 2.9% | |
| Other values (47) | 10972 | 54.9% |
| Value | Count | Frequency (%) | |
| 0 | 3584 | 17.9% | |
| 3.9 | 2 | < 0.1% | |
| 4 | 20 | 0.1% | |
| 4.1 | 44 | 0.2% | |
| 4.2 | 80 | 0.4% |
| Value | Count | Frequency (%) | |
| 9.4 | 1 | < 0.1% | |
| 9.3 | 1 | < 0.1% | |
| 9.2 | 2 | < 0.1% | |
| 9.1 | 1 | < 0.1% | |
| 9 | 9 | < 0.1% |
NOTA_MF
Real number (ℝ≥0)
| Distinct | 69 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.81763 |
|---|---|
| Minimum | 0 |
| Maximum | 11.5 |
| Zeros | 4331 |
| Zeros (%) | 21.7% |
| Memory size | 156.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 4.7 |
| median | 5.5 |
| Q3 | 6.5 |
| 95-th percentile | 8.2 |
| Maximum | 11.5 |
| Range | 11.5 |
| Interquartile range (IQR) | 1.8 |
Descriptive statistics
| Standard deviation | 2.734775335 |
|---|---|
| Coefficient of variation (CV) | 0.567659894 |
| Kurtosis | -0.4579879814 |
| Mean | 4.81763 |
| Median Absolute Deviation (MAD) | 0.9 |
| Skewness | -0.8262907215 |
| Sum | 96352.6 |
| Variance | 7.478996133 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 4331 | 21.7% | |
| 5 | 670 | 3.4% | |
| 5.2 | 665 | 3.3% | |
| 5.1 | 665 | 3.3% | |
| 5.3 | 655 | 3.3% | |
| 5.6 | 650 | 3.2% | |
| 5.4 | 648 | 3.2% | |
| 5.5 | 636 | 3.2% | |
| 5.7 | 596 | 3.0% | |
| 5.9 | 591 | 3.0% | |
| Other values (59) | 9893 | 49.5% |
| Value | Count | Frequency (%) | |
| 0 | 4331 | 21.7% | |
| 4.5 | 152 | 0.8% | |
| 4.6 | 363 | 1.8% | |
| 4.7 | 390 | 1.9% | |
| 4.8 | 455 | 2.3% |
| Value | Count | Frequency (%) | |
| 11.5 | 1 | < 0.1% | |
| 11.4 | 1 | < 0.1% | |
| 11 | 1 | < 0.1% | |
| 10.9 | 3 | < 0.1% | |
| 10.8 | 8 | < 0.1% |
NOTA_GO
Real number (ℝ≥0)
| Distinct | 56 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 3716 |
| Missing (%) | 18.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.534100958 |
|---|---|
| Minimum | 0 |
| Maximum | 10 |
| Zeros | 3537 |
| Zeros (%) | 17.7% |
| Memory size | 156.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 4.5 |
| median | 5.4 |
| Q3 | 6.2 |
| 95-th percentile | 7.2 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 1.7 |
Descriptive statistics
| Standard deviation | 2.509209393 |
|---|---|
| Coefficient of variation (CV) | 0.5534083639 |
| Kurtosis | -0.4288508308 |
| Mean | 4.534100958 |
| Median Absolute Deviation (MAD) | 0.8 |
| Skewness | -1.02567161 |
| Sum | 73833.3 |
| Variance | 6.296131778 |
| Monotonicity | Not monotonic |
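This variable has both missing values and zeros, and the two are treated differently. The reported figures are consistent with missing values being left out of the mean while zeros count as ordinary values. A minimal check using the numbers printed above (variable names are illustrative):

```python
import math

# Figures copied from the tables for this variable.
n_observations = 20000
n_missing = 3716
total = 73833.3           # reported Sum

# Missing values are excluded from the denominator; zeros are not.
n_present = n_observations - n_missing
mean = total / n_present

assert n_present == 16284
assert math.isclose(mean, 4.534100958, rel_tol=1e-6)
```

Dividing the same sum by all 20000 observations would give a lower value than the reported mean, so whether the zeros are genuine grades or placeholders matters for any downstream analysis.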
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 3537 | 17.7% | |
| 5.3 | 603 | 3.0% | |
| 5.5 | 576 | 2.9% | |
| 5.1 | 575 | 2.9% | |
| 5.2 | 574 | 2.9% | |
| 5.4 | 546 | 2.7% | |
| 5.6 | 542 | 2.7% | |
| 5 | 535 | 2.7% | |
| 5.8 | 517 | 2.6% | |
| 5.7 | 508 | 2.5% | |
| Other values (46) | 7771 | 38.9% | |
| (Missing) | 3716 | 18.6% |
| Value | Count | Frequency (%) | |
| 0 | 3537 | 17.7% | |
| 4.1 | 6 | < 0.1% | |
| 4.2 | 70 | 0.4% | |
| 4.3 | 147 | 0.7% | |
| 4.4 | 195 | 1.0% |
| Value | Count | Frequency (%) | |
| 10 | 1 | < 0.1% | |
| 9.6 | 1 | < 0.1% | |
| 9.5 | 1 | < 0.1% | |
| 9.4 | 1 | < 0.1% | |
| 9.2 | 1 | < 0.1% |
INGLES
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3628 |
| Missing (%) | 18.1% |
| Memory size | 156.2 KiB |
| Value | Count | Frequency (%) | |
| 1 | 10581 | 52.9% | |
| 0 | 5791 | 29.0% | |
| (Missing) | 3628 | 18.1% |
H_AULA_PRES
Real number (ℝ≥0)
| Distinct | 26 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.10295 |
|---|---|
| Minimum | 0 |
| Maximum | 25 |
| Zeros | 657 |
| Zeros (%) | 3.3% |
| Memory size | 156.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 14 |
| Maximum | 25 |
| Range | 25 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 4.118421951 |
|---|---|
| Coefficient of variation (CV) | 0.8070668831 |
| Kurtosis | 3.833488855 |
| Mean | 5.10295 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 1.850651561 |
| Sum | 102059 |
| Variance | 16.96139937 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=26)
| Value | Count | Frequency (%) | |
| 3 | 3855 | 19.3% | |
| 2 | 3723 | 18.6% | |
| 5 | 2981 | 14.9% | |
| 4 | 2847 | 14.2% | |
| 6 | 709 | 3.5% | |
| 7 | 706 | 3.5% | |
| 1 | 675 | 3.4% | |
| 8 | 669 | 3.3% | |
| 0 | 657 | 3.3% | |
| 9 | 628 | 3.1% | |
| Other values (16) | 2550 | 12.8% |
| Value | Count | Frequency (%) | |
| 0 | 657 | 3.3% | |
| 1 | 675 | 3.4% | |
| 2 | 3723 | 18.6% | |
| 3 | 3855 | 19.3% | |
| 4 | 2847 | 14.2% |
| Value | Count | Frequency (%) | |
| 25 | 40 | 0.2% | |
| 24 | 43 | 0.2% | |
| 23 | 40 | 0.2% | |
| 22 | 29 | 0.1% | |
| 21 | 33 | 0.2% |
TAREFAS_ONLINE
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.1403 |
|---|---|
| Minimum | 0 |
| Maximum | 7 |
| Zeros | 2204 |
| Zeros (%) | 11.0% |
| Memory size | 156.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 6 |
| Maximum | 7 |
| Range | 7 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.854909147 |
|---|---|
| Coefficient of variation (CV) | 0.5906789629 |
| Kurtosis | -0.9080523168 |
| Mean | 3.1403 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.01568274405 |
| Sum | 62806 |
| Variance | 3.440687944 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=8)
| Value | Count | Frequency (%) | |
| 2 | 5141 | 25.7% | |
| 5 | 4892 | 24.5% | |
| 3 | 2627 | 13.1% | |
| 4 | 2386 | 11.9% | |
| 0 | 2204 | 11.0% | |
| 1 | 1285 | 6.4% | |
| 6 | 901 | 4.5% | |
| 7 | 564 | 2.8% |
| Value | Count | Frequency (%) | |
| 0 | 2204 | 11.0% | |
| 1 | 1285 | 6.4% | |
| 2 | 5141 | 25.7% | |
| 3 | 2627 | 13.1% | |
| 4 | 2386 | 11.9% |
| Value | Count | Frequency (%) | |
| 7 | 564 | 2.8% | |
| 6 | 901 | 4.5% | |
| 5 | 4892 | 24.5% | |
| 4 | 2386 | 11.9% | |
| 3 | 2627 | 13.1% |
FALTAS
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.0606 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 156.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 3 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 7 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.674714266 |
|---|---|
| Coefficient of variation (CV) | 0.4124302483 |
| Kurtosis | -0.5736840042 |
| Mean | 4.0606 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.3731042662 |
| Sum | 81212 |
| Variance | 2.804667873 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=8)
| Value | Count | Frequency (%) | |
| 3 | 7043 | 35.2% | |
| 6 | 3579 | 17.9% | |
| 4 | 2866 | 14.3% | |
| 5 | 2454 | 12.3% | |
| 2 | 1670 | 8.3% | |
| 1 | 963 | 4.8% | |
| 7 | 828 | 4.1% | |
| 8 | 597 | 3.0% |
| Value | Count | Frequency (%) | |
| 1 | 963 | 4.8% | |
| 2 | 1670 | 8.3% | |
| 3 | 7043 | 35.2% | |
| 4 | 2866 | 14.3% | |
| 5 | 2454 | 12.3% |
| Value | Count | Frequency (%) | |
| 8 | 597 | 3.0% | |
| 7 | 828 | 4.1% | |
| 6 | 3579 | 17.9% | |
| 5 | 2454 | 12.3% | |
| 4 | 2866 | 14.3% |
PERFIL
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 156.2 KiB |
| Value | Count | Frequency (%) | |
| EXATAS | 8230 | 41.1% | |
| DIFICULDADE | 7001 | 35.0% | |
| HUMANAS | 3196 | 16.0% | |
| MUITO_BOM | 902 | 4.5% | |
| EXCELENTE | 671 | 3.4% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 11 |
|---|---|
| Median length | 7 |
| Mean length | 8.146 |
| Min length | 6 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
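The calculation described above can be sketched in a few lines of plain Python. The function name `pearson_r` and the sample values are illustrative assumptions, not part of the report or of any library:

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)

# Illustrative values only, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

r = pearson_r(x, y)
# Shifting and rescaling one variable leaves r unchanged.
r_shifted = pearson_r([10 * v + 3 for v in x], y)
print(r, r_shifted)
```

The second call illustrates the invariance under changes of location and scale mentioned in the text.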
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
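As a sketch of the rank-based calculation described above, the following plain-Python example ranks each variable (tied values share the mean of their positions, which is an assumption about tie handling) and then applies the covariance-over-standard-deviations formula to the ranks. All names and sample values are illustrative:

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson's r computed on the ranks of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Illustrative values: a monotonic but nonlinear relationship.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

For this monotonic but nonlinear example, ρ reaches its maximum while Pearson's r stays below it, which is the behaviour the paragraph describes.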
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
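The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements that definition directly; the function name and sample values are illustrative. Note that libraries may use a different variant: SciPy's `kendalltau`, for example, defaults to tau-b, which adjusts for ties, so results can differ when the data contain tied values.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total pairs, as described in the text."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Tied pairs (sign == 0) count toward the total only.
    return (concordant - discordant) / len(pairs)

# Illustrative values only, not taken from the dataset.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]

tau = kendall_tau_a(x, y)
print(tau)
```

In this example 7 of the 10 pairs are concordant and 3 are discordant.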
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | MATRICULA | NOME | REPROVACOES_DE | REPROVACOES_EM | REPROVACOES_MF | REPROVACOES_GO | NOTA_DE | NOTA_EM | NOTA_MF | NOTA_GO | INGLES | H_AULA_PRES | TAREFAS_ONLINE | FALTAS | PERFIL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 502375 | Márcia Illiglener | 0 | 0 | 0 | 0 | 6.2 | 5.8 | 4.6 | 5.9 | 0.0 | 2 | 4 | 3 | EXATAS |
| 1 | 397093 | Jason Jytereoman Izoimum | 0 | 0 | 0 | 0 | 6.0 | 6.2 | 5.2 | 4.5 | 1.0 | 2 | 4 | 3 | EXATAS |
| 2 | 915288 | Bartolomeu Inácio da Gama | 0 | 0 | 0 | 0 | 7.3 | 6.7 | 7.1 | 7.2 | 0.0 | 5 | 0 | 3 | HUMANAS |
| 3 | 192652 | Fernanda Guedes | 1 | 3 | 1 | 1 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 4 | 4 | 4 | DIFICULDADE |
| 4 | 949491 | Alessandre Borba Gomes | 1 | 3 | 1 | 1 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 5 | 2 | 5 | DIFICULDADE |
| 5 | 627360 | Magali Hellen Gejibaflião | 0 | 0 | 0 | 0 | 7.3 | 7.4 | 7.6 | 6.5 | 1.0 | 5 | 3 | 5 | HUMANAS |
| 6 | 804493 | Tiago Brisu Pires | 0 | 0 | 0 | 0 | 5.8 | 6.0 | 7.3 | 5.1 | 1.0 | 5 | 2 | 6 | DIFICULDADE |
| 7 | 433789 | Andressa Gabrielle da Silva | 0 | 0 | 0 | 0 | 4.9 | 5.0 | 5.9 | 4.6 | NaN | 2 | 2 | 6 | DIFICULDADE |
| 8 | 178335 | Gilmar Oséas Etonvic | 0 | 0 | 0 | 0 | 4.4 | 4.8 | 4.7 | 4.6 | 1.0 | 3 | 4 | 4 | DIFICULDADE |
| 9 | 987229 | Otávia Mônica Noopu | 0 | 0 | 0 | 0 | 6.4 | 5.4 | 5.0 | 5.5 | 1.0 | 3 | 5 | 3 | EXATAS |
Last rows
| | MATRICULA | NOME | REPROVACOES_DE | REPROVACOES_EM | REPROVACOES_MF | REPROVACOES_GO | NOTA_DE | NOTA_EM | NOTA_MF | NOTA_GO | INGLES | H_AULA_PRES | TAREFAS_ONLINE | FALTAS | PERFIL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19990 | 170643 | Celso Eric da Lira | 1 | 1 | 1 | 1 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | 2 | 4 | 5 | DIFICULDADE |
| 19991 | 259981 | Maria Charlene Dadamu | 0 | 0 | 0 | 0 | 5.5 | 5.2 | 5.5 | 4.3 | 1.0 | 4 | 5 | 1 | EXATAS |
| 19992 | 991838 | Lucas de Drummond Gamdoz | 0 | 0 | 0 | 0 | 6.3 | 5.6 | 5.7 | 5.4 | 1.0 | 2 | 2 | 3 | EXATAS |
| 19993 | 489491 | Zenaide Dace de Britto | 0 | 0 | 3 | 1 | 6.4 | 6.1 | 0.0 | 0.0 | 1.0 | 7 | 2 | 6 | DIFICULDADE |
| 19994 | 876548 | Nair Jeniffer da Silva Dias | 1 | 1 | 1 | 1 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 20 | 4 | 4 | DIFICULDADE |
| 19995 | 856673 | Laércio Mário da Silva | 0 | 0 | 0 | 0 | 7.0 | 7.9 | 5.8 | 7.0 | 1.0 | 9 | 5 | 6 | EXATAS |
| 19996 | 576100 | Cibele Quésia Poza | 1 | 1 | 1 | 1 | 0.0 | 0.0 | 0.0 | NaN | 1.0 | 3 | 2 | 5 | DIFICULDADE |
| 19997 | 888739 | Marcielle Chale Bape | 0 | 0 | 0 | 0 | 7.9 | 7.6 | 8.3 | 7.2 | NaN | 8 | 3 | 1 | EXCELENTE |
| 19998 | 722743 | Suzanne Mirian Mourão | 0 | 0 | 1 | 1 | 6.3 | 5.1 | 0.0 | 0.0 | 1.0 | 3 | 2 | 6 | DIFICULDADE |
| 19999 | 417268 | Maria Isaiane da Silva Luwequisman | 0 | 0 | 1 | 1 | 7.0 | 7.3 | 0.0 | 0.0 | NaN | 3 | 0 | 6 | DIFICULDADE |